Sound Event Detection in Multisource Environments Using Source Separation
Authors
Abstract
This paper proposes a sound event detection system for natural multisource environments that uses a sound source separation front-end. The recognizer aims to detect sound events from various everyday contexts. The audio is preprocessed using non-negative matrix factorization and separated into four individual signals. Each sound event class is represented by a hidden Markov model trained on mel-frequency cepstral coefficients extracted from the audio. Features are extracted from each separated signal individually, and sound events are then segmented and classified with the Viterbi algorithm. The separation allows detection of at most four overlapping events. The proposed system shows a significant increase in event detection accuracy compared with a system that outputs a single sequence of events.
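To make the pipeline described in the abstract concrete, the following rough Python sketch (using librosa, scikit-learn, and hmmlearn) illustrates an NMF-based separation front-end followed by MFCC extraction and per-class hidden Markov models. The FFT size, component count, number of MFCCs, HMM topology, and the fixed-window scoring used in place of the paper's Viterbi segmentation are illustrative assumptions, not the authors' implementation.

```python
# Rough sketch of an NMF separation front-end for sound event detection.
# Assumptions (not from the paper): 1024-sample FFT, 4 NMF components used
# directly as "streams", 16 MFCCs, and per-class GaussianHMMs scored over
# fixed-length windows instead of a full Viterbi segmentation.
import numpy as np
import librosa
from sklearn.decomposition import NMF
from hmmlearn.hmm import GaussianHMM

N_STREAMS = 4          # maximum number of overlapping events handled
N_FFT, HOP = 1024, 512

def separate_streams(y, sr):
    """Split a mixture into N_STREAMS signals via NMF soft masking."""
    S = librosa.stft(y, n_fft=N_FFT, hop_length=HOP)
    V = np.abs(S)                                   # magnitude spectrogram
    nmf = NMF(n_components=N_STREAMS, init="random", random_state=0, max_iter=400)
    W = nmf.fit_transform(V)                        # (freq, components)
    H = nmf.components_                             # (components, frames)
    streams = []
    for k in range(N_STREAMS):
        V_k = np.outer(W[:, k], H[k])               # rank-1 component spectrogram
        mask = V_k / (W @ H + 1e-10)                # Wiener-style soft mask
        streams.append(librosa.istft(mask * S, hop_length=HOP))
    return streams

def mfcc_features(y, sr):
    """Frame-level MFCC features, shape (frames, n_mfcc)."""
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=16).T

def train_class_models(training_data, n_states=3):
    """Train one HMM per event class on MFCCs of isolated examples."""
    models = {}
    for label, feature_list in training_data.items():
        X = np.vstack(feature_list)
        lengths = [len(f) for f in feature_list]
        hmm = GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
        hmm.fit(X, lengths)
        models[label] = hmm
    return models

def detect_events(stream, sr, models, win_frames=100):
    """Label fixed-length windows of one separated stream by best HMM score
    (a crude stand-in for the paper's Viterbi segmentation)."""
    feats = mfcc_features(stream, sr)
    events = []
    for start in range(0, len(feats) - win_frames + 1, win_frames):
        window = feats[start:start + win_frames]
        label = max(models, key=lambda m: models[m].score(window))
        events.append((start, start + win_frames, label))
    return events
```

Because each of the four separated streams is decoded independently, up to four overlapping events can be reported for the same time region, which is the property the abstract emphasizes.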
Similar articles
Advances in audio source separation and multisource audio content retrieval
Audio source separation aims to extract the signals of individual sound sources from a given recording. In this paper, we review three recent advances that improve the robustness of source separation in challenging real-world scenarios and enable its use for multisource content retrieval tasks, such as automatic speech recognition (ASR) or acoustic event detection (AED) in noisy environments. ...
Special issue on speech separation and recognition in multisource environments
One of the chief difficulties of building distant-microphone speech recognition systems for use in 'everyday' applications is that the noise background is typically 'multisource'. A speech recognition system designed to operate in a family home, for example, must contend with competing noise from televisions and radios, children playing, vacuum cleaners, and outdoor noise from open windows. D...
Decoding sound source location and separation using neural population activity patterns.
The strategies by which the central nervous system decodes the properties of sensory stimuli, such as sound source location, from the responses of a population of neurons are a matter of debate. We show, using the average firing rates of neurons in the inferior colliculus (IC) of awake rabbits, that prevailing decoding models of sound localization (summed population activity and the population ...
Neural encoding of sound source location in the presence of a concurrent, spatially separated source.
In the presence of multiple, spatially separated sound sources, the binaural cues used for sound localization in the horizontal plane become distorted relative to the cues from each sound in isolation, yet localization in everyday multisource acoustic environments remains robust. We examined changes in the azimuth tuning functions of inferior colliculus (IC) neurons in unanesthetized rabbits to a targ...
A joint separation-classification model for sound event detection of weakly labelled data
Source separation (SS) aims to separate individual sources from an audio recording. Sound event detection (SED) aims to detect sound events from an audio recording. We propose a joint separation-classification (JSC) model trained only on weakly labelled audio data, that is, only the tags of an audio recording are known but the times of the events are unknown. First, we propose a separation mappi...
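As a loose illustration of the joint separation-classification idea for weakly labelled data described in the last entry, the sketch below (PyTorch) predicts a per-class time-frequency mask and pools it into clip-level tag probabilities, so that training needs only recording-level tags. The architecture, pooling operations, and sizes are hypothetical and are not taken from the cited JSC model.

```python
# Minimal sketch of a joint separation-classification idea for weakly
# labelled SED: the network predicts a per-class time-frequency mask
# (a "separation mapping"), which is pooled into clip-level tag
# probabilities so that only weak (tag) labels are needed for training.
# Architecture, sizes, and pooling choice are illustrative assumptions.
import torch
import torch.nn as nn

class WeaklyLabelledJSC(nn.Module):
    def __init__(self, n_classes, n_mels=64):
        super().__init__()
        # Small CNN over a (batch, 1, time, mel) log-mel spectrogram.
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        # One mask per class, same time-frequency resolution as the input.
        self.mask_head = nn.Conv2d(32, n_classes, kernel_size=1)

    def forward(self, spec):
        h = self.backbone(spec)                      # (B, 32, T, F)
        masks = torch.sigmoid(self.mask_head(h))     # (B, C, T, F) separation maps
        # Average over frequency gives framewise presence; max over time
        # gives clip-level tag probabilities matching the weak labels.
        framewise = masks.mean(dim=3)                # (B, C, T)
        clipwise = framewise.max(dim=2).values       # (B, C)
        return masks, framewise, clipwise

# Training uses only clip-level tags, e.g.:
# model = WeaklyLabelledJSC(n_classes=10)
# masks, framewise, tags = model(log_mel_batch)     # log_mel_batch: (B, 1, T, 64)
# loss = nn.functional.binary_cross_entropy(tags, weak_labels)
```

At inference time the framewise output of such a model can be thresholded to recover event activity over time, even though only clip-level tags were available during training.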